
    Inertial Stochastic PALM (iSPALM) and Applications in Machine Learning

    Inertial algorithms for minimizing nonsmooth and nonconvex functions, such as the inertial proximal alternating linearized minimization algorithm (iPALM), have demonstrated their superiority over their non-inertial variants with respect to computation time. In many problems in imaging and machine learning, the objective functions have a special form involving huge data sets, which encourages the application of stochastic algorithms. While algorithms based on stochastic gradient descent are still used in the majority of applications, stochastic algorithms for minimizing nonsmooth and nonconvex functions have recently been proposed as well. In this paper, we derive an inertial variant of a stochastic PALM algorithm with a variance-reduced gradient estimator, called iSPALM, and prove linear convergence of the algorithm under certain assumptions. Our inertial approach can be seen as a generalization of momentum methods, which are widely used to speed up and stabilize optimization algorithms, in particular in machine learning, to nonsmooth problems. Numerical experiments for learning the weights of a so-called proximal neural network and the parameters of Student-t mixture models show that our new algorithm outperforms both stochastic PALM and its deterministic counterparts.
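    To make the flavor of such methods concrete, the following minimal sketch shows the inertial proximal-gradient building block underlying PALM-type schemes: an extrapolation step with momentum parameter beta followed by a proximal-gradient step. The soft-thresholding prox, the quadratic toy problem and all parameter choices are illustrative assumptions; this is not the paper's iSPALM with its variance-reduced stochastic gradient estimator.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal map of tau * ||.||_1, used here as an example nonsmooth term.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def inertial_prox_grad_step(x, x_prev, grad, step, beta, tau):
    # Inertial extrapolation followed by a proximal-gradient step.
    y = x + beta * (x - x_prev)
    x_new = soft_threshold(y - step * grad(y), step * tau)
    return x_new, x  # new iterate and new "previous" iterate

# Toy usage: minimize 0.5*||A x - b||^2 + tau*||x||_1 with full gradients.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(30, 10)), rng.normal(size=30)
grad = lambda z: A.T @ (A @ z - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the smooth part
x, x_prev = np.zeros(10), np.zeros(10)
for _ in range(200):
    x, x_prev = inertial_prox_grad_step(x, x_prev, grad, step, beta=0.5, tau=0.1)
```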

    Variational models for color image correction inspired by visual perception and neuroscience

    Reproducing the perception of a real-world scene on a display device is a very challenging task which requires understanding of the camera processing pipeline, the display process, and the way the human visual system processes the light it captures. Mathematical models based on psychophysical and physiological laws of color vision, named Retinex, provide efficient tools to handle degradations produced during the camera processing pipeline, such as the reduction of contrast. In particular, Batard and Bertalmío [J. Math. Imaging Vis. 60(6), 849-881 (2018)] described some psychophysical laws on brightness perception as covariant derivatives, included them in a variational model, and observed that the quality of the color image correction is correlated with the accuracy of the vision model it includes. Based on this observation, we postulate that this model can be improved by including more accurate data on vision, with special attention paid here to visual neuroscience. Then, inspired by the presence in area V1 of the visual cortex of neurons responding to different visual attributes, such as orientation, color or movement, to name a few, and of horizontal connections modeling the interactions between those neurons, we construct two variational models to process both local (edges, textures) and global (contrast) features. This is an improvement with respect to the model of Batard and Bertalmío, as the latter cannot process local and global features independently and simultaneously. Finally, we conduct experiments on color images which corroborate the improvement provided by the new models.

    Wasserstein Steepest Descent Flows of Discrepancies with Riesz Kernels

    The aim of this paper is twofold. Based on the geometric Wasserstein tangent space, we first introduce Wasserstein steepest descent flows. These are locally absolutely continuous curves in the Wasserstein space whose tangent vectors point into a steepest descent direction of a given functional. This allows the use of Euler forward schemes instead of Jordan--Kinderlehrer--Otto schemes. For $\lambda$-convex functionals, we show that Wasserstein steepest descent flows are an equivalent characterization of Wasserstein gradient flows. The second aim is to study Wasserstein flows of the maximum mean discrepancy with respect to certain Riesz kernels. The crucial part here is the treatment of the interaction energy. Although it is not $\lambda$-convex along generalized geodesics, we give analytic expressions for Wasserstein steepest descent flows of the interaction energy starting at Dirac measures. In contrast to smooth kernels, the particle may explode, i.e., a Dirac measure becomes a non-Dirac one. The computation of steepest descent flows amounts to finding equilibrium measures with external fields, which nicely links Wasserstein flows of interaction energies with potential theory. Finally, we provide numerical simulations of Wasserstein steepest descent flows of discrepancies.
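    As a rough illustration of the Euler forward viewpoint, the sketch below evolves a particle approximation of a measure by explicit Euler steps on the squared MMD with the negative distance kernel K(x,y) = -||x-y|| (the Riesz kernel with r = 1). The step size, initialization and target are arbitrary assumptions, and the sketch does not reproduce the paper's steepest descent construction for flows starting at Dirac measures.

```python
import numpy as np

def mmd_grad(x, y):
    # Gradient of the squared MMD between the empirical measures of x (N,d)
    # and y (M,d) for the negative distance kernel K(a,b) = -||a-b||.
    def unit_diffs(a, b):
        d = a[:, None, :] - b[None, :, :]
        n = np.linalg.norm(d, axis=-1, keepdims=True)
        return np.divide(d, n, out=np.zeros_like(d), where=n > 0)
    N, M = len(x), len(y)
    attraction = unit_diffs(x, y).sum(axis=1) * (2.0 / (N * M))   # pull toward y
    repulsion = unit_diffs(x, x).sum(axis=1) * (2.0 / N ** 2)     # spread particles
    return attraction - repulsion

# Explicit Euler (forward) discretization of the flow.
rng = np.random.default_rng(1)
x = rng.normal(size=(50, 2)) * 0.1          # initial particles
y = rng.normal(size=(50, 2)) + 3.0          # particles of the target measure
for _ in range(500):
    x -= 0.05 * mmd_grad(x, y)
```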

    Alternatives to the EM Algorithm for ML-Estimation of Location, Scatter Matrix and Degree of Freedom of the Student-t Distribution

    In this paper, we consider maximum likelihood estimation of the degree of freedom parameter $\nu$, the location parameter $\mu$ and the scatter matrix $\Sigma$ of the multivariate Student-t distribution. In particular, we are interested in estimating the degree of freedom parameter $\nu$, which determines the tails of the corresponding probability density function and has rarely been considered in detail in the literature so far. We prove that under certain assumptions a minimizer of the negative log-likelihood function exists, where we have to take special care of the case $\nu \rightarrow \infty$, for which the Student-t distribution approaches the Gaussian distribution. As alternatives to the classical EM algorithm, we propose three other algorithms which cannot be interpreted as EM algorithms. For fixed $\nu$, the first algorithm is an accelerated EM algorithm known from the literature. However, since we do not fix $\nu$, we cannot apply standard convergence results for the EM algorithm. The other two algorithms differ from this algorithm in the iteration step for $\nu$. We show how the objective function behaves for the different updates of $\nu$ and prove for all three algorithms that it decreases in each iteration step. We compare the algorithms, as well as some accelerated versions, by numerical simulation and apply one of them for estimating the degree of freedom parameter in images corrupted by Student-t noise.
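    For orientation, the sketch below implements the classical EM fixed-point updates for the location $\mu$ and scatter $\Sigma$ at fixed $\nu$, i.e., the kind of baseline the paper compares against; the accelerated variants and the proposed $\nu$-updates are not reproduced here, and the synthetic data are purely illustrative.

```python
import numpy as np

def student_t_em(X, nu, iters=100):
    # Classical EM updates for the location mu and scatter Sigma of a
    # multivariate Student-t distribution with fixed degrees of freedom nu.
    n, d = X.shape
    mu, Sigma = X.mean(axis=0), np.cov(X, rowvar=False)
    for _ in range(iters):
        diff = X - mu
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)
        w = (nu + d) / (nu + maha)                       # E-step weights
        mu = (w[:, None] * X).sum(axis=0) / w.sum()      # M-step: location
        diff = X - mu
        Sigma = (w[:, None] * diff).T @ diff / n         # M-step: scatter
    return mu, Sigma

# Toy usage on synthetic heavy-tailed data.
rng = np.random.default_rng(2)
X = rng.standard_t(df=5, size=(500, 3)) + np.array([1.0, -2.0, 0.5])
mu_hat, Sigma_hat = student_t_em(X, nu=5.0)
```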

    Generative Sliced MMD Flows with Riesz Kernels

    Maximum mean discrepancy (MMD) flows suffer from high computational costs in large scale computations. In this paper, we show that MMD flows with Riesz kernels $K(x,y) = -\|x-y\|^r$, $r \in (0,2)$, have exceptional properties which allow for their efficient computation. First, the MMD of Riesz kernels coincides with the MMD of their sliced version. As a consequence, the computation of gradients of MMDs can be performed in the one-dimensional setting. Here, for $r=1$, a simple sorting algorithm can be applied to reduce the complexity from $O(MN+N^2)$ to $O((M+N)\log(M+N))$ for two empirical measures with $M$ and $N$ support points. For the implementations we approximate the gradient of the sliced MMD by using only a finite number $P$ of slices. We show that the resulting error has complexity $O(\sqrt{d/P})$, where $d$ is the data dimension. These results enable us to train generative models by approximating MMD gradient flows by neural networks even for large scale applications. We demonstrate the efficiency of our model by image generation on MNIST, FashionMNIST and CIFAR10.
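    A minimal sketch of the sorting idea for $r=1$ follows: the 1D squared MMD with kernel K(x,y) = -|x-y| is computed from ordered-pair sums obtained by sorting, and then averaged over P random slices. Normalizing constants relating the sliced and unsliced MMD, as well as the paper's exact gradient computation, are omitted; all parameter choices are assumptions.

```python
import numpy as np

def ordered_pair_abs_sum(v):
    # sum_{i,j} |v_i - v_j| over all ordered pairs, in O(n log n) via sorting:
    # for sorted v, sum_{i<j} (v_(j) - v_(i)) = sum_j (2j - n - 1) v_(j).
    v = np.sort(v)
    n = len(v)
    return 2.0 * np.dot(2.0 * np.arange(1, n + 1) - n - 1, v)

def mmd2_1d(x, y):
    # Squared MMD of two 1D samples for K(a,b) = -|a-b|.
    N, M = len(x), len(y)
    s_x, s_y = ordered_pair_abs_sum(x), ordered_pair_abs_sum(y)
    s_xy = 0.5 * (ordered_pair_abs_sum(np.concatenate([x, y])) - s_x - s_y)
    return -s_x / N**2 - s_y / M**2 + 2.0 * s_xy / (N * M)

def sliced_mmd2(X, Y, P=100, rng=None):
    # Average the 1D squared MMD over P random projection directions.
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=(P, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    return np.mean([mmd2_1d(X @ t, Y @ t) for t in theta])

X = np.random.default_rng(3).normal(size=(200, 10))
Y = np.random.default_rng(4).normal(size=(300, 10)) + 1.0
print(sliced_mmd2(X, Y))
```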

    Manifold Learning by Mixture Models of VAEs for Inverse Problems

    Representing a manifold of very high-dimensional data with generative models has been shown to be computationally efficient in practice. However, this requires that the data manifold admits a global parameterization. In order to represent manifolds of arbitrary topology, we propose to learn a mixture model of variational autoencoders. Here, every encoder-decoder pair represents one chart of a manifold. We propose a loss function for maximum likelihood estimation of the model weights and choose an architecture that provides us with analytical expressions of the charts and of their inverses. Once the manifold is learned, we use it for solving inverse problems by minimizing a data fidelity term restricted to the learned manifold. To solve the arising minimization problem, we propose a Riemannian gradient descent algorithm on the learned manifold. We demonstrate the performance of our method for low-dimensional toy examples as well as for deblurring and electrical impedance tomography on certain image manifolds.
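    To indicate how a data fidelity term can be minimized over the range of a learned decoder, the toy sketch below performs plain gradient descent in the latent space of a single, randomly initialized "chart" decoder with a hypothetical subsampling forward operator. This stands in for, but does not reproduce, the paper's mixture of trained VAE charts and its Riemannian gradient descent on the learned manifold.

```python
import torch

# Toy stand-ins: a single "chart" decoder and a forward operator (assumptions,
# not the paper's trained mixture of VAE charts).
decoder = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.Tanh(),
                              torch.nn.Linear(16, 64))
for p in decoder.parameters():
    p.requires_grad_(False)
forward_op = lambda img: img[..., ::2]                  # e.g. subsampling
y = forward_op(decoder(torch.randn(2)))                 # synthetic observation

# Minimize the data fidelity restricted to the range of the decoder by
# descending in its latent space.
z = torch.zeros(2, requires_grad=True)
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = 0.5 * torch.sum((forward_op(decoder(z)) - y) ** 2)
    loss.backward()
    opt.step()
reconstruction = decoder(z).detach()
```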

    PatchNR: Learning from Very Few Images by Patch Normalizing Flow Regularization

    Learning neural networks from only very little available information is an important ongoing research topic with tremendous potential for applications. In this paper, we introduce a powerful regularizer for the variational modeling of inverse problems in imaging. Our regularizer, called the patch normalizing flow regularizer (patchNR), involves a normalizing flow learned on small patches of very few images. In particular, the training is independent of the considered inverse problem, such that the same regularizer can be applied for different forward operators acting on the same class of images. By investigating the distribution of patches versus that of the whole image class, we prove that our model is indeed a MAP approach. Numerical examples for low-dose and limited-angle computed tomography (CT) as well as superresolution of material images demonstrate that our method provides very high quality results. The training set consists of just six images for CT and one image for superresolution. Finally, we combine our patchNR with ideas from internal learning for performing superresolution of natural images directly from the low-resolution observation without knowledge of any high-resolution image.
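    A hedged sketch of such a variational objective reads as follows: a data fidelity term plus a penalty given by the mean negative log-likelihood of image patches under a patch model. The patch extraction parameters, the forward operator and the stand-in patch likelihood (an i.i.d. standard normal model replacing a trained normalizing flow) are all assumptions for illustration.

```python
import numpy as np

def extract_patches(img, s=6, stride=3):
    # Collect flattened s x s patches of a 2D image (simple loop version).
    H, W = img.shape
    return np.array([img[i:i+s, j:j+s].ravel()
                     for i in range(0, H - s + 1, stride)
                     for j in range(0, W - s + 1, stride)])

def patchnr_objective(x, y, forward_op, patch_nll, lam=0.1):
    # Data fidelity plus patch regularizer, in the spirit of the variational
    # formulation described above. patch_nll should return the negative
    # log-likelihood of each flattened patch under the trained patch model;
    # here it is a placeholder argument.
    residual = forward_op(x) - y
    fidelity = 0.5 * np.sum(residual ** 2)
    patches = extract_patches(x)
    return fidelity + lam * np.mean(patch_nll(patches))

# Placeholder stand-in for a trained flow: i.i.d. standard normal patch model.
toy_nll = lambda P: 0.5 * np.sum(P ** 2, axis=1)
x = np.random.default_rng(5).random((32, 32))
print(patchnr_objective(x, x.copy(), forward_op=lambda z: z, patch_nll=toy_nll))
```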

    PCA Reduced Gaussian Mixture Models with Applications in Superresolution

    Despite the rapid development of computational hardware, the treatment of large and high-dimensional data sets is still a challenging problem. This paper provides a twofold contribution to the topic. First, we propose a Gaussian Mixture Model in conjunction with a reduction of the dimensionality of the data in each component of the model by principal component analysis, called PCA-GMM. To learn the (low-dimensional) parameters of the mixture model, we propose an EM algorithm whose M-step requires the solution of constrained optimization problems. Fortunately, these constrained problems do not depend on the usually large number of samples and can be solved efficiently by an (inertial) proximal alternating linearized minimization algorithm. Second, we apply our PCA-GMM for the superresolution of 2D and 3D material images based on the approach of Sandeep and Jacob. Numerical results confirm the moderate influence of the dimensionality reduction on the overall superresolution result.
    Multi-scale image superresolution in materials science with geometric attributes
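    As an illustration of the per-component dimensionality reduction in a PCA-GMM, the sketch below evaluates the log-likelihood of a mixture whose component covariances have a low-rank-plus-isotropic form U diag(s) U^T + sigma^2 I (a probabilistic-PCA-style parametrization). This parametrization is an assumption for illustration; the paper's exact constrained model and its PALM-based M-step are not reproduced.

```python
import numpy as np
from scipy.special import logsumexp

def component_logpdf(X, mean, U, s, sigma2):
    # Gaussian log-density with covariance U diag(s) U^T + sigma2 * I, where
    # U (d x q) has orthonormal columns spanning the component's PCA subspace.
    d = X.shape[1]
    cov = U @ np.diag(s) @ U.T + sigma2 * np.eye(d)
    diff = X - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum('ij,ij->i', diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def pca_gmm_loglik(X, weights, means, Us, ss, sigma2s):
    # Log-likelihood of a GMM whose components are dimension-reduced as above.
    comp = np.stack([np.log(w) + component_logpdf(X, m, U, s, v)
                     for w, m, U, s, v in zip(weights, means, Us, ss, sigma2s)])
    return logsumexp(comp, axis=0).sum()

# Toy usage with two components sharing a random 2D subspace of R^5.
rng = np.random.default_rng(6)
X = rng.normal(size=(100, 5))
U = np.linalg.qr(rng.normal(size=(5, 2)))[0]
print(pca_gmm_loglik(X, [0.5, 0.5], [np.zeros(5), np.ones(5)],
                     [U, U], [np.ones(2), np.ones(2)], [0.1, 0.1]))
```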

    WPPNets and WPPFlows: The Power of Wasserstein Patch Priors for Superresolution

    Exploiting image patches instead of whole images has proved to be a powerful approach to tackle various problems in image processing. Recently, Wasserstein patch priors (WPP), which are based on the comparison of the patch distributions of the unknown image and a reference image, were successfully used as data-driven regularizers in the variational formulation of superresolution. However, for each input image, this approach requires the solution of a non-convex minimization problem which is computationally costly. In this paper, we propose to learn two kinds of neural networks in an unsupervised way based on WPP loss functions. First, we show how convolutional neural networks (CNNs) can be incorporated. Once the network, called WPPNet, is learned, it can be applied very efficiently to any input image. Second, we incorporate conditional normalizing flows to provide a tool for uncertainty quantification. Numerical examples demonstrate the very good performance of WPPNets for superresolution in various image classes, even if the forward operator is known only approximately.
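    To give an impression of a WPP-style loss, the sketch below compares the patch distributions of two images by a sliced, sorting-based squared Wasserstein distance between their flattened patches. Patch size, stride, the number of slices and the choice of the sliced 2-Wasserstein distance are assumptions; the actual WPP loss in the paper may be defined differently.

```python
import numpy as np

def patches(img, s=6, stride=2):
    # Collect flattened s x s patches of a 2D image.
    H, W = img.shape
    return np.array([img[i:i+s, j:j+s].ravel()
                     for i in range(0, H - s + 1, stride)
                     for j in range(0, W - s + 1, stride)])

def sliced_patch_sw2(img_a, img_b, P=50, s=6, rng=None):
    # Sliced squared 2-Wasserstein distance between the patch distributions of
    # two images: project flattened patches onto random directions and compare
    # the sorted 1D projections (equal sample sizes via subsampling).
    rng = np.random.default_rng(rng)
    A, B = patches(img_a, s), patches(img_b, s)
    m = min(len(A), len(B))
    A = A[rng.choice(len(A), m, replace=False)]
    B = B[rng.choice(len(B), m, replace=False)]
    total = 0.0
    for _ in range(P):
        theta = rng.normal(size=s * s)
        theta /= np.linalg.norm(theta)
        total += np.mean((np.sort(A @ theta) - np.sort(B @ theta)) ** 2)
    return total / P

out = np.random.default_rng(7).random((40, 40))
ref = np.random.default_rng(8).random((40, 40))
print(sliced_patch_sw2(out, ref))
```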